KMP (Knuth-Morris-Pratt) String Matching Demo
Efficient pattern search using the Longest Proper Prefix (LPS) array.
š Problem Statement & Input
Problem Statement:
Lily, a software engineer, needs to find all occurrences of a specific pattern in a larger text for her text analysis project. Given two strings $\text{txt}$ (the text) and $\text{pat}$ (the pattern) of sizes $N$ and $M$ respectively, where $N > M$, help Lily by printing all starting indices of the pattern in the text using the KMP algorithm.
**Input:** Text ($\text{txt}$) and Pattern ($\text{pat}$).
**Output:** The starting indices of all occurrences of the pattern in the text (0-based indexing).
Input Section
Initial Visualization:
Text:
Pattern:
š” LPS Array Computation (Preprocessing)
LPS Definition:
The $\text{LPS}[i]$ value is the length of the longest proper prefix of $\text{pat}[0 \dots i]$ that is also a suffix of $\text{pat}[0 \dots i]$. This array dictates the optimal shift distance upon a mismatch.
Pattern ($\text{pat}$) and its LPS Array ($\pi$):
LPS Calculation Summary:
LPS array will be calculated upon loading the input.
š¬ Step-by-Step Pattern Search
Search Visualization
**Pointers:** Text Index ($i$) = ?, Pattern Index ($j$) = ?
Output Log
Found Pattern Indices (0-based): None
š Algorithm Analysis
Time Complexity
O(N + M)
The **LPS array preprocessing** takes $O(M)$ time. The **search phase** takes $O(N)$ time. Since $N > M$, the overall complexity is $O(N+M)$, which is optimal.
Space Complexity
O(M)
Space is required only to store the $\text{LPS}$ array, which has a size equal to the length of the pattern ($M$).
Why KMP is Superior
The Naive algorithm's worst-case time complexity is $O(N \cdot M)$. KMP avoids this by intelligently sliding the pattern based on the pre-computed $\text{LPS}$ array, never comparing the same text character twice. This is particularly crucial for strings with repeating patterns, like $T=\text{AAAAAB}$ and $P=\text{AAAAB}$.